We present a lower bound for the free energy of a quantum many-body system at finite temperature.
This lower bound is expressed as a convex optimization problem with linear constraints, and is
derived using strong subadditivity of von Neumann entropy and a relaxation of the consistency con- 
dition of local density operators. The dual to this minimization problem leads to a set of quantum 
belief propagation equations, thus providing a firm theoretical foundation to that approach. The 
minimization problem is numerically tractable, and we find good agreement with quantum Monte 
Carlo for the spin-1/2 Heisenberg antiferromagnet in two dimensions. This lower bound complements
other variational upper bounds. We discuss applications to Hamiltonian complexity theory and give 
a generalization of the structure theorem of [16] to trees in an appendix. 
Describing the properties of a local quantum system is
perhaps the central problem of theoretical physics. How- 
ever, the exponential growth of the Hilbert space with 
system size makes it prohibitive to write down the
state of a system with even a modest number of sites. For
this reason, variational methods, such as matrix product 
states used in DMRG PJ5] and their higher dimensional 
generalizations 5,6, are a central tool, describing a state 
with a small number of parameters, allowing a practical 
optimization of the energy. 

All these methods provide an upper bound to the free 
energy and the quality of the approximation cannot be 
assessed directly. In this Letter, we present a lower bound 
to the free energy that nicely complements variational ap- 
proaches. We use strong subadditivity (SSA) of von Neu- 
mann entropy [7] to approximate the system's entropy 
by a local quantity. This approximation is exact when 
the system is a Markov network [5] (i.e., when its long-range
correlations arise from correlations over shorter
distances), but in general it provides an upper bound to the
true entropy, and hence a lower bound to the free energy.
By relaxing the consistency constraints on
formula for the free energy expressed as a convex mini- 
mization problem with linear constraints. 

Our formula for the free energy is similar to the Bethe
free energy [5] (and its generalization by Kikuchi [10]),
but differs by a crucial ordering of the lattice sites. This
distinction is responsible for the lower bound obtained by
our method, in contrast to Bethe's and Kikuchi's approximations,
which are uncontrolled. The dual of the minimization
problem provides a set of quantum belief propagation
equations similar to those presented in [11, 12].
This connection provides a solid theoretical foundation to
understand the success and limitations of quantum belief
propagation. Similar connections [13] and algorithms
[14] have been found in the classical setting.
Markov entropy decomposition — Consider a lattice of N 
spins that we label from 1 to N. The labeling of the sites 
chosen will determine the order in which we apply our 
procedure later. The Hamiltonian of the system is a sum
of geometrically local terms $H = \sum_X h_X$, where $X$ labels
subsets of $\{1, \ldots, N\}$ and locality means that $h_X = 0$
when the radius of $X$ is larger than some constant $w$.
Given the density matrix $\rho$ of the system, we can compute
the average energy $E(\rho) = \mathrm{Tr}(\rho H) = \sum_X \mathrm{Tr}(\rho_X h_X)$
from knowledge of only the reduced density matrices
$\rho_X = \mathrm{Tr}_{\bar X}\,\rho$ on small local regions, which are obtained
by the partial trace of $\rho$ over the complement $\bar X$ of $X$.
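
The identity $E(\rho) = \sum_X \mathrm{Tr}(\rho_X h_X)$ can be checked numerically on a small example. The sketch below is illustrative only: the three-site open chain, the Heisenberg bond term, and the helper names `partial_trace` and `random_density_matrix` are assumptions made for this example and are not taken from the text.

```python
import numpy as np

def partial_trace(rho, keep, n_sites, d=2):
    """Reduced density matrix on the sites listed in `keep` (0-based)."""
    t = rho.reshape([d] * (2 * n_sites))
    traced = sorted(set(range(n_sites)) - set(keep), reverse=True)
    for count, s in enumerate(traced):
        # Row axis of site s is s; its column axis sits n_left positions later.
        n_left = n_sites - count
        t = np.trace(t, axis1=s, axis2=s + n_left)
    dim = d ** len(keep)
    return np.asarray(t).reshape(dim, dim)

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (hypothetical test state)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

# Spin-1/2 Heisenberg bond term h = S.S acting on two neighboring sites.
sx = np.array([[0, 1], [1, 0]]) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]]) / 2
h_bond = sum(np.kron(s, s) for s in (sx, sy, sz))

rng = np.random.default_rng(0)
n = 3
rho = random_density_matrix(2 ** n, rng)
I2 = np.eye(2)
H = np.kron(h_bond, I2) + np.kron(I2, h_bond)  # bonds (0,1) and (1,2)

# Energy from the global state versus energy from two-site marginals only.
e_global = np.trace(rho @ H).real
e_local = sum(
    np.trace(partial_trace(rho, [k, k + 1], n) @ h_bond).real
    for k in range(n - 1)
)
```

Only the two-site marginals enter `e_local`, which is why the energy, unlike the entropy, poses no difficulty for a method built on local density matrices.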

At finite temperature $T$, we are interested in the system's
free energy $F(T) = \min_\rho \{E(\rho) - T S(\rho)\}$. Unlike
the energy, the entropy $S(\rho) = -\mathrm{Tr}(\rho \log \rho)$ cannot
be evaluated in general from knowledge of only the reduced
density matrices $\rho_X$ over regions $X$ of finite radius.
We define an approximate way of doing this evaluation.
For every site $k$, define a subset of sites $\mathcal{N}_k$ consisting
of "neighboring" sites. There is no unique prescription
for the choice of $\mathcal{N}_k$, but it is useful to imagine
that it consists of the sites located within a
finite distance from $k$. With trivial manipulations, we
can rewrite the entropy of the system in the form of
an "entropy chain rule" $S(\rho) = \sum_{k=1}^{N} S(k|\{<k\})$, where
the conditional entropy of a region $X$ given region $Y$ is
$S(X|Y) = S(X \cup Y) - S(Y)$, the entropy of any region
$X$ is denoted $S(X) = S(\rho_X) = -\mathrm{Tr}(\rho_X \log \rho_X)$, and we
use the notation $\{<k\} = \{1, 2, \ldots, k-1\}$.
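
For completeness, the chain rule is a telescoping sum. Writing $\{\le k\} = \{<k\} \cup \{k\}$ (a shorthand introduced only for this remark) and using $S(\emptyset) = 0$:

```latex
\begin{align*}
\sum_{k=1}^{N} S(k|\{<k\})
  &= \sum_{k=1}^{N} \big[ S(\{\le k\}) - S(\{<k\}) \big] \\
  &= S(\{\le N\}) - S(\emptyset) \\
  &= S(\rho).
\end{align*}
```

Each term $S(\{\le k\})$ cancels against $S(\{<k+1\})$ in the next summand, so no approximation has been made at this stage.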

Quantum entropy $S$ obeys SSA [7], which implies the
bound

$$S(k|\{<k\}) \le S(k|\{<k\} \cap \mathcal{N}_k) = S(k|\mathcal{M}_k), \qquad (1)$$

where we define $\mathcal{M}_k = \{<k\} \cap \mathcal{N}_k$. We call $\mathcal{M}_k$ the
"Markov shield" of site $k$, see Fig. 1. We can define the
Markov entropy $S_M(\rho) = \sum_{k=1}^{N} S(k|\mathcal{M}_k)$, which upper
bounds the system's entropy. Because each term in that



FIG. 1: (Color online) a) The Markov shield (shown in blue)
$\mathcal{M}_k$ is the intersection of the neighborhood (green) of $k$ and
the sites preceding $k$ (orange). b) The entanglement (represented
by black lines) between site $k$ and the preceding sites
is all mediated by the Markov shield: the state of the first
$k$ sites can be constructed by adding one extra spin to the
state of the first $k-1$ sites and coupling it only to the sites of
the shield [16]. This turns the inequality Eq. (1) into an equality.
c) There is direct entanglement between site $k$ and the sites
preceding $k$, so the Markov entropy is not equal to the true
entropy, but it is an upper bound.



sum can be computed from the reduced density matri- 
ces on site k and its Markov shield, the Markov entropy, 
unlike the entropy S, is suitable for direct numerical cal- 
culations. 
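
The bound $S_M(\rho) \ge S(\rho)$ can be illustrated on a small random state. This is a minimal sketch under stated assumptions: a four-qubit chain, a hypothetical shield made of the `shield_size` sites immediately preceding site $k$, and helper names (`reduced`, `entropy`, `markov_entropy`) invented for the example.

```python
import numpy as np

def reduced(rho, keep, n, d=2):
    """Reduced density matrix on the sites in `keep` (0-based)."""
    t = rho.reshape([d] * (2 * n))
    traced = sorted(set(range(n)) - set(keep), reverse=True)
    for count, s in enumerate(traced):
        t = np.trace(t, axis1=s, axis2=s + n - count)
    dim = d ** len(set(keep))
    return np.asarray(t).reshape(dim, dim)

def entropy(rho):
    """Von Neumann entropy -Tr(rho log rho) in nats."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def cond_entropy(rho, k, given, n):
    """S(k | given) = S(given + {k}) - S(given)."""
    joint = sorted(set(given) | {k})
    return entropy(reduced(rho, joint, n)) - entropy(reduced(rho, given, n))

def markov_entropy(rho, n, shield_size):
    """Sum of S(k | M_k), with M_k the last `shield_size` preceding sites."""
    total = 0.0
    for k in range(n):
        shield = [j for j in range(k) if k - j <= shield_size]
        total += cond_entropy(rho, k, shield, n)
    return total

rng = np.random.default_rng(0)
n = 4
a = rng.normal(size=(2 ** n, 2 ** n)) + 1j * rng.normal(size=(2 ** n, 2 ** n))
rho = a @ a.conj().T
rho /= np.trace(rho).real

s_true = entropy(rho)
s_markov_1 = markov_entropy(rho, n, shield_size=1)          # truncated shields
s_markov_full = markov_entropy(rho, n, shield_size=n - 1)   # full chain rule
```

With the shield equal to all preceding sites the chain rule is recovered exactly; shrinking the shield can only increase the estimate, by SSA.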

Returning to the free energy calculation, we now have
the bound $F(T) \ge F_M(T) = \min_\rho \{E(\rho) - T S_M(\rho)\}$. The
Markov free energy $F_M$ of any given state is equal to its
true free energy if SSA is saturated with the given choice
of Markov shields, as shown in Fig. 1. Because both $E$ and
$S_M$ can be evaluated from the density matrix of constant-size
regions $X$, we can express $F_M(\rho) = E(\rho) - T S_M(\rho)$
as a function of some set of reduced density operators
$\{\rho_X\}$ and write $F_M(T) = \min_{\{\rho_X\} \in \Omega} F_M(\{\rho_X\})$, where
$\Omega$ denotes the set of consistent reduced density matrices
that are all obtainable from some global density matrix
$\rho$, i.e. $\Omega \equiv \{\{\rho_X\} : \exists \rho,\ \rho_X = \mathrm{Tr}_{\bar X}\,\rho,\ \forall X\}$.

Unfortunately, verifying consistency of a set of reduced
density matrices $\{\rho_X\}$ is a difficult problem: it is QMA-complete
[15], so it is very unlikely that $\Omega$ can be characterized
efficiently. Thus, we will make one more approximation
and enlarge the set $\Omega$ to the set $\tilde\Omega$ of all locally
consistent reduced density matrices that agree on
overlapping regions, i.e. $\tilde\Omega \equiv \{\{\rho_X\} : \mathrm{Tr}_{X \setminus Y}\,\rho_X =
\mathrm{Tr}_{Y \setminus X}\,\rho_Y,\ \forall (X, Y)\}$. Since all reduced density matrices
in $\Omega$ are derived from one global $\rho$, it should be clear
that $\Omega \subset \tilde\Omega$, and as a consequence

$$F_{\rm MED}(T) \equiv \min_{\{\rho_X\} \in \tilde\Omega} F_M(\{\rho_X\}) \le F_M(T) \le F(T). \qquad (2)$$

Equation (2) defines our numerical method, which we
call the Markov entropy decomposition (MED) scheme.
The Markov free energy $F_M(\{\rho_X\})$ to be minimized to
evaluate $F_{\rm MED}(T)$ is a convex function over the cone
FIG. 2: Numerical results obtained from MED for the spin-1/2
Heisenberg antiferromagnet on a 2D square lattice. The
energy (green) and free energy (blue) are obtained for a 7-
and 10-site Markov shield, of shape illustrated in the upper
left corner. Results are compared to exact diagonalization of
a 4 x 4 lattice and quantum Monte Carlo. The crossing of
energy and free energy curve (negative entropy, 7-site shield)
provides a lower bound to the ground energy.

of positive semidefinite operators $\{\rho_X\}$ subject to some linear
constraints specified in the definition of $\tilde\Omega$. Thus, it is
suitable for numerical optimization.
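
A small example may help show why enlarging $\Omega$ to $\tilde\Omega$ can only lower the minimum. In the sketch below (an illustration, not part of the method described here) two spin singlets are assigned to the overlapping bonds (0,1) and (1,2) of a three-site chain. The marginals agree on the shared site, so the pair is locally consistent, yet no global state has both marginals because site 1 cannot be maximally entangled with two partners at once. The Heisenberg bond term and all function names are assumptions chosen for the example.

```python
import numpy as np

def trace_first(rho2):
    """Trace out the first qubit of a two-qubit density matrix."""
    return np.einsum('ijik->jk', rho2.reshape(2, 2, 2, 2))

def trace_second(rho2):
    """Trace out the second qubit of a two-qubit density matrix."""
    return np.einsum('ijkj->ik', rho2.reshape(2, 2, 2, 2))

# Spin singlet on a pair of sites, used for both bonds.
singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)
rho_01 = np.outer(singlet, singlet.conj())
rho_12 = rho_01.copy()

# Local consistency: both bonds must give the same state on shared site 1.
locally_consistent = np.allclose(trace_first(rho_01), trace_second(rho_12))

# Heisenberg bond term h = S.S and the true 3-site open-chain ground energy.
sx = np.array([[0, 1], [1, 0]]) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]]) / 2
h_bond = sum(np.kron(s, s) for s in (sx, sy, sz))
I2 = np.eye(2)
H3 = np.kron(h_bond, I2) + np.kron(I2, h_bond)
e_ground = np.linalg.eigvalsh(H3)[0]

# Energy assigned by the locally consistent (but unphysical) marginals.
e_relaxed = (np.trace(rho_01 @ h_bond) + np.trace(rho_12 @ h_bond)).real
```

The relaxed energy lies below the true ground energy, which is the zero-temperature analogue of the first inequality in Eq. (2).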
Numerical results on translationally invariant systems — 
The procedure simplifies greatly when applied to translationally
invariant systems. If we assume that all density
matrices $\rho_X$ are related by translational symmetry, the
Markov free energy is a function of a single density matrix.
We have numerically investigated this method with
a spin-1/2 antiferromagnetic Heisenberg model on an infinite
two-dimensional square lattice. We have used
Markov shields of size 7 and 10, so that the main computational
task of our program was exact diagonalization
of (non-sparse) matrices of size $2^8$ and $2^{11}$ respectively.
Figure 2 compares our results to other methods.

The MED free energy with the 10-site shield is in excel- 
lent agreement with quantum Monte Carlo for the entire 
temperature range. This agreement with QMC is better 
than the one obtained from exact diagonalization (ED) 
of a 4 x 4 lattice. In fact, those diagonalization results 
are very well approximated by MED with a 7-site shield. 
Here we see the biggest advantage of MED: because of 
the constraints imposed on the minimization, the results 
converge to the thermodynamic limit faster than ED. 

Since entropy is positive and $dF/dT = -S$, we see that
the free energy is a monotonically decreasing function of
temperature. However, the Markov entropy $S_M(\{\rho_X\})$
can be negative when global consistency is not satisfied,
and we indeed observe that the slope of the free
energy changes sign near $T = 0.2$. The Markov entropy becomes
negative where $F_M(T) = E_M(T)$. Since $F_M(T) \le
E(\rho(0)) - T S_M(\rho(0)) \le E_0$, with $\rho(0)$ the ground state
and $E_0$ its energy, the crossing point of the
MED energy and free energy obtained with the 7-site
shield gives a lower bound $E_0 \ge -0.7062\ldots$ to the true
ground state energy of the system.
We have used this technique to lower bound the ground
state energy of the one-dimensional model. Results obtained
with a $k$-site neighborhood are in good agreement
with ED results on a chain of length roughly $2k$ (with
periodic boundary conditions). This can be understood
from the fact that the ground-state entropy of a block
of $\ell$ sites, $S(\ell)$, is an increasing function of $\ell$ for $\ell < k$,
and then decreases to reach 0 when $\ell = 2k$ since the entire
system is in a pure state. Thus, enforcing a positive
Markov entropy density $S_M = S(k) - S(k-1)$ compels
the system in our simulations to behave as if it were on a
lattice of size $2k$, even though we are manipulating states
of $k$ spins, providing some heuristic explanation for the
improved convergence, compared to ED, seen above.

All these lower bounds on the ground state energy and 
the lower bounds on the free energy, would be rigorous 
if the convex optimization problem were solved exactly. 
However, all our results are subject to numerical error. 
We used fairly elementary minimization methods (con- 
jugate gradient) and more elaborate techniques that ex- 
ploit the special features of this problem are likely to 
improve the results; we hope that this Letter will stim- 
ulate research in this direction. Numerical fluctuations 
are most prominent in the energy, while the free energy 
curve is rather smooth. The fluctuations are largest near 
the specific heat peak; to understand this, consider the
free energy $E - T S_M$ as a function of $E$, assuming for
simplicity that $S_M$ equals the correct entropy $S(E)$. At
a minimum of $F$, $\partial F/\partial E = 0$ and $\partial^2 F/\partial E^2 = 1/(Tc)$,
where $c$ is the specific heat, and so for large
$c$, the basin around the minimum is shallow, increasing
numerical error. We now describe an alternate approach,
a dual problem, which connects to quantum belief propagation.
If this dual problem could be turned into a variational
dual problem (a concave function whose maximum
equals the minimum of the Markov free energy), it would
provide mathematically rigorous lower bounds on $F$.
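
The curvature estimate behind the shallow-basin argument follows from standard thermodynamic relations. In the sketch below, $T^*(E)$ (a symbol introduced only here) denotes the temperature at which the equilibrium energy equals $E$, so that $dS/dE = 1/T^*(E)$ and $c = dE/dT^*$:

```latex
\begin{align*}
F(E) &= E - T\,S(E), \\
\frac{\partial F}{\partial E} &= 1 - \frac{T}{T^*(E)} = 0
   \quad\Longrightarrow\quad T^*(E) = T, \\
\frac{\partial^2 F}{\partial E^2}
   &= -T\,\frac{d^2 S}{dE^2}
    = \frac{T}{T^{*2}}\,\frac{dT^*}{dE}
    = \frac{1}{T\,c} \quad \text{at the minimum.}
\end{align*}
```

A large specific heat therefore means a small second derivative, so an error $\delta E$ in the energy changes $F$ only by about $\delta E^2/(2Tc)$. This is consistent with the energy curve being noisier than the free energy curve.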
Dual problem: quantum belief propagation — Consider a 
length-$N$ spin chain and define density matrices $\rho_k$ and
$\sigma_k$ associated to segments $k-n$ to $k$ and $k-n$ to $k-1$
respectively, as in Fig. 3. In this case, the minimization
problem defined at Eq. (2) becomes

$$\min \sum_{k=n}^{N} \Big( \mathrm{Tr}\{\rho_k [H_k + \log \rho_k - I \otimes A_k - B_k \otimes I + \mu_k]\} - \mathrm{Tr}\{\sigma_k [\log \sigma_k - A_{k-1} - B_k + \nu_k]\} \Big), \qquad (3)$$

where for $k = n, \ldots, N$, the matrices $A_k$ and $B_k$ and the
scalars $\mu_k$ and $\nu_k$ are Lagrange multipliers used to enforce
$\mathrm{Tr}_1 \rho_k = \sigma_{k+1}$, $\mathrm{Tr}_n \rho_k = \sigma_k$, and the trace normalization
of $\rho_k$ and $\sigma_k$ respectively, and $A_n = 0$. Above, $H_k$ is
the part of the Hamiltonian supported on sites $k$ to $k+n$,
properly weighted to avoid double counting, and we
have set temperature $T = 1$ to avoid cluttering equations.
Taking derivatives with respect to $\rho_k$ and $\sigma_k$ yields



$$H_k + \log \rho_k - I A_k - B_k I + \mu'_k = 0, \qquad (4)$$

$$\log \sigma_k - A_{k-1} - B_k + \nu'_k = 0, \qquad (5)$$



FIG. 3: The density matrix $\rho_k$ describes the state of sites
$k-n$ to $k$ while the state $\sigma_k$ is for sites $k-n$ to $k-1$, with
$n = 5$ in this example.



where $\nu'_k = \nu_k + 1$ and $\mu'_k = \mu_k + 1$, and we have dropped
the $\otimes$ symbols. These equations, together with the constraints
imposed on the reduced density matrices, give a
set of self-consistent mean-field equations

$$A_{k-1} = \log(\mathrm{Tr}_n \rho_k) - B_k + \nu'_k = \log\big(\mathrm{Tr}_n\, e^{-H_k + I A_k + B_k I - \mu'_k}\big) - B_k + \nu'_k, \qquad (6)$$

$$B_{k+1} = \log(\mathrm{Tr}_1 \rho_k) - A_k + \nu'_{k+1} = \log\big(\mathrm{Tr}_1\, e^{-H_k + I A_k + B_k I - \mu'_k}\big) - A_k + \nu'_{k+1}. \qquad (7)$$

One can show that any solution to these equations is a
minimum of the Markov free energy Eq. (2). Because this
function is convex, the solution to Eqs. (6,7) is unique.

We can conceive an iterative procedure to approach
solutions to Eqs. (6,7). Starting from an initial guess
for the $A_k$ and $B_k$, we insert these values into the
right-hand sides of Eqs. (6,7), which provides new values,
and iterate. Renaming $A_{k-1} = \log m_{k \to k-1}$ and $B_{k+1} =
\log m_{k \to k+1}$, we recognize Eqs. (6,7) as almost the belief
propagation prescription of [11, 12]:

$$m_{k \to k-1} \propto \mathrm{Tr}_n(\Lambda_k \odot m_{k+1 \to k} \odot m_{k-1 \to k}) \odot m_{k-1 \to k}^{-1}, \qquad (8)$$

$$m_{k \to k+1} \propto \mathrm{Tr}_1(\Lambda_k \odot m_{k+1 \to k} \odot m_{k-1 \to k}) \odot m_{k+1 \to k}^{-1}, \qquad (9)$$

$$\rho_k \propto \Lambda_k \odot m_{k+1 \to k} \odot m_{k-1 \to k}, \qquad (10)$$

where all proportionality constants can be set by normalization
and $\Lambda_k = \exp(-H_k)$. The product $\odot$ is defined
by $A \odot B = \exp(\log A + \log B)$. We note a subtle difference
between these belief propagation equations and
those of [12]. If the action of the partial trace and the
$\odot$ product were commutative as they are in the classical
case, the two appearances of the term $m_{k-1 \to k}$ in Eq. (8)
would cancel, and similarly for $m_{k+1 \to k}$ in Eq. (9). These
cancellations were assumed in [11, 12], based on heuristic
arguments and numerical evidence. However, we see
that the uncancelled terms are required to establish a direct connection
with the MED. Any fixed point of the iteration equations
for messages $m$ yields a lower bound to the free energy
of the system. Moreover, as in [5, 11, 12], this iterative
procedure can be used to evaluate other quantities such
as correlation functions.
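
The $\odot$ product and the cancellation discussed above can be explored numerically. The following is a minimal sketch, assuming strictly positive Hermitian inputs and a two-qubit toy setting; `herm_fun`, `odot`, and `cancellation_gap` are hypothetical helper names. It evaluates $\mathrm{Tr}_1(X \odot (I \otimes m)) \odot m^{-1}$ and compares it with $\mathrm{Tr}_1 X$, once for commuting (diagonal, classical) inputs and once for generic non-commuting ones.

```python
import numpy as np

def herm_fun(a, f):
    """Apply a scalar function f to a Hermitian matrix via its spectrum."""
    w, v = np.linalg.eigh(a)
    return (v * f(w)) @ v.conj().T

def odot(a, b):
    """A (.) B = exp(log A + log B) for positive definite A and B."""
    return herm_fun(herm_fun(a, np.log) + herm_fun(b, np.log), np.exp)

def trace_first(x):
    """Trace out the first qubit of a two-qubit operator."""
    return np.einsum('ijik->jk', x.reshape(2, 2, 2, 2))

def random_positive(dim, rng):
    """Random positive definite matrix (hypothetical test input)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return g @ g.conj().T + np.eye(dim)

def cancellation_gap(x, m):
    """Norm of Tr_1(x (.) (I x m)) (.) m^{-1}  minus  Tr_1(x)."""
    inner = trace_first(odot(x, np.kron(np.eye(2), m)))
    lhs = odot(inner, np.linalg.inv(m))
    return float(np.linalg.norm(lhs - trace_first(x)))

rng = np.random.default_rng(1)

# Classical case: everything is diagonal, so the message cancels exactly.
x_cl = np.diag(rng.uniform(0.5, 2.0, size=4))
m_cl = np.diag(rng.uniform(0.5, 2.0, size=2))
gap_classical = cancellation_gap(x_cl, m_cl)

# Quantum case: generic non-commuting inputs; inspect the gap rather than
# assuming it vanishes.
x_q = random_positive(4, rng)
m_q = random_positive(2, rng)
gap_quantum = cancellation_gap(x_q, m_q)
```

The product is symmetric by construction, `odot(a, b) == odot(b, a)`, even when `a @ b != b @ a`. Comparing `gap_classical` with `gap_quantum` gives a concrete feel for how much the cancellation assumed in the heuristic equations can fail.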

State reconstruction and probabilistically checkable proofs 
(PCP) — Given a global quantum state p, such that SSA 
is saturated for the given choice of Markov shields, we
can reconstruct the global state from the local states. Using
the structure theorem of [16], we have
$\log(\rho_{\{<k+1\}}) = \log(\rho_{\{<k\}}) + \log(\rho_{k \cup \mathcal{M}_k}) - \log(\rho_{\mathcal{M}_k})$.
Iterating this procedure allows us to reconstruct the global state from the
local states. In the Appendix, we extend this idea and
show that any state saturating SSA on a tree graph is
the thermal state of a Hamiltonian that is the sum of local,
commuting terms. This procedure may help address
the structure of topologically ordered states, since many
lattice models with topological order saturate SSA with
an appropriate choice of shields [17] (see the Appendix).

FIG. 4: The blue and green regions are two different Markov
shields for site $k$ (the green neighborhood is not a connected
region).
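
The reconstruction formula can be checked on a simple state that saturates SSA. The sketch below uses an assumed three-qubit example in which the middle site is classical and mediates all correlations between the outer sites; the state, the helper names, and the site labels 0, 1, 2 are all choices made for this illustration. The identities on untouched sites, left implicit in the formula, are written out as Kronecker products.

```python
import numpy as np

def herm_fun(a, f):
    """Apply a scalar function f to a Hermitian matrix via its spectrum."""
    w, v = np.linalg.eigh(a)
    return (v * f(w)) @ v.conj().T

def reduced(rho, keep, n, d=2):
    """Reduced density matrix on the sites in `keep` (0-based)."""
    t = rho.reshape([d] * (2 * n))
    traced = sorted(set(range(n)) - set(keep), reverse=True)
    for count, s in enumerate(traced):
        t = np.trace(t, axis1=s, axis2=s + n - count)
    dim = d ** len(keep)
    return np.asarray(t).reshape(dim, dim)

def random_state(dim, rng):
    """Random full-rank density matrix (hypothetical test input)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T + np.eye(dim)
    return rho / np.trace(rho).real

rng = np.random.default_rng(2)
I2 = np.eye(2)
p = np.array([0.3, 0.7])                      # distribution of the middle site
proj = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

# rho = sum_b p_b  rho_0^(b) (x) |b><b| (x) rho_2^(b): a quantum Markov chain.
rho = sum(
    p[b] * np.kron(np.kron(random_state(2, rng), proj[b]), random_state(2, rng))
    for b in range(2)
)

log = lambda a: herm_fun(a, np.log)
rho_01 = reduced(rho, [0, 1], 3)   # state of the preceding sites {<k}
rho_12 = reduced(rho, [1, 2], 3)   # new site k=2 together with its shield {1}
rho_1 = reduced(rho, [1], 3)       # the shield alone

log_rec = (np.kron(log(rho_01), I2)
           + np.kron(I2, log(rho_12))
           - np.kron(I2, np.kron(log(rho_1), I2)))
rho_rec = herm_fun(log_rec, np.exp)
```

For a generic state that does not saturate SSA, `rho_rec` would differ from `rho` and would not even be normalized in general.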

Deciding whether the ground state energy of a classical
Hamiltonian on $N$ particles is 0 or greater than
$N\epsilon$ for some positive constant $\epsilon$ is a very difficult problem.
In general, it is NP-complete, by the famous PCP
theorem [18]. The analogous decision problem for a quantum
Hamiltonian [19] is in QMA [20], but it is not known
to be QMA-complete (this is the quantum PCP conjecture).
While this question concerns zero temperature,
it is equivalent to determining whether the free energy
becomes negative at temperature $T < \epsilon/\log d$, where $d$ is
the number of levels of each particle. It is easy to verify whether
a set of operators $\{A_k, B_k\}$ is a solution to Eqs. (6,7), so
the problem of lower bounding the free energy of a quantum
system using the Markov entropy decomposition is
in NP. Thus, one way to disprove the quantum PCP conjecture
would be to find a rigorous upper bound to this
lower bound, e.g., by analyzing its scaling as a function of
the size of the Markov shield. State reconstruction may
prove useful here.
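
The equivalence between the zero-temperature question and the sign of the free energy rests on elementary entropy bounds. A short derivation, with $E_0$ denoting the ground state energy (notation assumed here) and using $0 \le S(\rho) \le N \log d$:

```latex
\begin{align*}
F(T) &= \min_\rho \{E(\rho) - T S(\rho)\} \;\ge\; E_0 - T N \log d, \\
F(T) &\le E_0 - T S(\rho_0) \;\le\; E_0
   \qquad (\rho_0 \text{ a ground state}).
\end{align*}
```

Hence if $E_0 \ge N\epsilon$ and $T < \epsilon/\log d$, then $F(T) > 0$, whereas if $E_0 = 0$ then $F(T) \le 0$ at every temperature.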

Multi-patch MED — We now discuss a possible extension 
of our method. Let $F_M^1$ and $F_M^2$ denote the Markov free
energy formulas obtained from two different choices of neighborhoods
in our procedure. Clearly, $F_M^{\max} = \max_k F_M^k$ is a
lower bound to the free energy. The convex function

$$F_{\rm MED}(T) = \min_{\{\rho_X\} \in \tilde\Omega}\ \max_k F_M^k(\{\rho_X\})$$

is an even better lower bound. That is, instead of minimizing
$F_M^1$ and $F_M^2$ separately, we minimize their maximum,
subject to the constraint that the reduced density
matrices used to compute the two formulas are locally
consistent with one another.

In particular, the shapes of $\mathcal{N}^1$ and $\mathcal{N}^2$ can be chosen
to capture correlations on different length scales of
the system. Figure 4 illustrates two such choices. The
blue region captures the short-scale entanglement (depicted
by a dashed line) while the green neighborhood
captures the long-range entanglement (full line). The
free energy formula obtained by the combination of both
regions is forced to assign reduced density matrices compatible
with both types of correlations.
Discussion — MED is on the one hand a possible nu- 
merical tool for studying the thermodynamics of quan- 
tum systems in a more accurate way than is possible 
using exact diagonalization. On the other hand, it pro- 
vides a theoretical basis for the quantum belief propaga- 
tion procedure developed previously to study disordered 
quantum systems; while we focused on translationally invariant
systems above, we can apply the procedure more
generally, e.g. to quantum spin glasses [21], treating each
reduced density matrix $\rho_X$ as an independent variable.
Finally, it offers a physics-inspired procedure that may 
help tackle outstanding problems in quantum computa- 
tional complexity. 
Appendix A: Structure of Thermal States
Saturating Strong Subadditivity Locally 

In this appendix we discuss the structure of states 
which exactly saturate strong subadditivity. Our major 
result is a statement about the structure of such states 
on a tree graph. However, first we would like to briefly 
discuss why such states occur even in the ground state 
of finite dimensional topologically ordered lattice mod- 
els. Consider a lattice model such as those considered in 
[17] which has a vanishing correlation length but also has 
topological order. Such a model displays an interesting 
correction to the entropy, called "topological entangle- 
ment entropy" . This causes the entropy of a region to 
have a term which is proportional to the boundary of the 
region, plus a constant which depends upon the topol- 
ogy of the region. In [17] , this constant is extracted by 
considering a sum and difference of entropies over dif- 
ferent regions. However, this sum and difference exactly 
corresponds to a conditional mutual information of three 
regions A, B, C, for a particular choice of the regions. 
If we pick them so that $ABC$ is an annulus, as shown in
Fig. 5a, then the mutual information between $A$ and
$C$ conditioned on $B$ is proportional to the entanglement
entropy term that Levin and Wen consider, and strong 
subadditivity can be used to determine the sign of the 
topological correction to the entanglement entropy. 
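
For reference, the conditional mutual information is $I(A{:}C|B) = S(AB) + S(BC) - S(B) - S(ABC)$, and SSA is the statement that it is non-negative. A minimal sketch for three-qubit pure states follows; the GHZ and product test states and the function names are assumptions chosen for illustration, not examples from the lattice models discussed here.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in nats."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def conditional_mutual_information(psi):
    """I(A:C|B) for a three-qubit pure state psi with qubit order A, B, C."""
    rho = np.outer(psi, psi.conj()).reshape(2, 2, 2, 2, 2, 2)
    rho_ab = np.einsum('abcdec->abde', rho).reshape(4, 4)
    rho_bc = np.einsum('abcaef->bcef', rho).reshape(4, 4)
    rho_b = np.einsum('abcaec->be', rho)
    s_abc = entropy(rho.reshape(8, 8))
    return entropy(rho_ab) + entropy(rho_bc) - entropy(rho_b) - s_abc

ghz = np.zeros(8)
ghz[0] = ghz[7] = 1 / np.sqrt(2)      # (|000> + |111>)/sqrt(2)
product = np.zeros(8)
product[0] = 1.0                      # |000>

cmi_ghz = conditional_mutual_information(ghz)
cmi_product = conditional_mutual_information(product)
```

A vanishing value signals saturation of SSA for that choice of regions; a strictly positive value signals correlations between $A$ and $C$ that are not mediated by $B$.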

So, from this we learn that for certain choices of sites 
and shields in such a model we will not see a saturation 
of the conditional mutual information. Indeed, problems 
occur whenever there is a topology change. However, we 
can instead consider a case as in Fig. 5b in which all
three regions are contractible. In this case, the conditional
mutual information vanishes in these models and
strong subadditivity is saturated. Thus, we can in many 
cases find a sequence of sites to add and a choice of shields 
such that strong subadditivity is saturated. In particu- 
lar, let us consider a system on a sphere. Then if we 
choose the set $\{1, \ldots, k\}$ to be contractible at every step
but the last (this is why we chose the sphere), and the 
neighborhoods are chosen to be a small circle around each 
site, then strong subadditivity will be saturated at every 
stage. One can verify that when the last site is added, 
strong subadditivity is saturated also. 

Since strong subadditivity is saturated at each stage, 
this enables us to write the projector onto the ground 
state of the system as a matrix product operator with 
bounded bond dimension. To do this, we iterate the result
that saturation of strong subadditivity implies that
$\rho_{ABC} = \rho_{BC}^{1/2} \rho_B^{-1/2} \rho_{AB} \rho_B^{-1/2} \rho_{BC}^{1/2}$; each such operator $\rho_{AB}$
has bounded bond dimension, and as a result the operator
$\rho_{ABC}$ has bounded bond dimension. There exists
some product state such that $\rho_{ABC}$ acting on that state
is non-zero. Applying $\rho_{ABC}$ to that state then gives a
FIG. 5: a) Topological entanglement entropy from a configuration
with topology change. b) No topology change, and
strong subadditivity is saturated.



representation of the ground state as a matrix product 
state or PEPS (projected entangled pair state) [5]. 

After this discussion, we now turn to the structure of 
states saturating strong subadditivity on a tree, in which 
case we can prove much more about the structure of the 
states. We will prove that any such density matrix can 
be written as $\rho = Z^{-1} \exp(-H)$ for $H$ a sum of commuting
local operators and $Z$ a normalization constant. We
prove this result as a corollary of a result which gener- 
alizes the structure theorem of [16] to tree graphs. Our 
structure theorem on tree graphs has the physical inter- 
pretation that there are two types of correlations between 
nodes on the graph. There are classical correlations, 
which can be long-ranged, but are always mediated by 
correlations between intermediate nodes, and there are 
quantum correlations which are limited to nearest neigh- 
bors. 

We begin with the special case of a one dimensional
system, with the sites $1, 2, \ldots, k$ chosen in order along a
line. For simplicity, assume that strong subadditivity
becomes saturated at the shortest nontrivial length scale,
when $\mathcal{M}_k = \{k-1\}$. Note that if strong subadditivity
is saturated on some larger length scale (for example, if
we only saturate strong subadditivity when $\mathcal{M}_k = \{k-2, k-1\}$),
then by a rescaling of the system, grouping
several sites into one site, we can reduce to the case when
$\mathcal{M}_k = \{k-1\}$.

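The structure theorem implies that the Hilbert space
$\mathcal{H}_{k-1}$ on any site $k-1$ can be decomposed as a direct sum

$$\mathcal{H}_{k-1} = \bigoplus_j \mathcal{H}_{k-1^L(j)} \otimes \mathcal{H}_{k-1^R(j)}, \qquad (A1)$$

so that

$$\rho_{\{1,\ldots,k\}} = \bigoplus_j q_{k-1}(j)\, \rho_{\{1,\ldots,k-2\},\,k-1^L(j)} \otimes \rho_{k-1^R(j),\,k}, \qquad (A2)$$

where the $q_{k-1}(j)$ are a probability distribution.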

Let us assume that the temperature $T = 1$, for
notational simplicity. Write the density matrix $\rho =
\frac{1}{Z} \exp(-H)$ for some $H$. We will show how to write
such an $H$ as a sum of local, commuting operators. Let
$P_k(j)$ denote the operator on site $k$ which projects onto
$B(k)^L_j \otimes B(k)^R_j$. Define

$$q_k(j|i) = \mathrm{Tr}(P_k(j) P_{k-1}(i) \rho)/\mathrm{Tr}(P_{k-1}(i) \rho). \qquad (A3)$$

Then,

$$\rho = \frac{1}{Z} \exp\Big[-\sum_{k=1}^{N} H_k\Big], \qquad (A4)$$

where, with $q_1(j) = \mathrm{Tr}(\rho P_1(j))$,

$$H_1 = -\sum_j P_1(j) \ln(q_1(j)), \qquad (A5)$$

and

$$H_k = -\sum_{i,j} P_k(j) P_{k-1}(i) \Big( \ln(q_k(j|i)) + \ln(\rho_{k-1^R(i),\,k^L(j)}) \Big) \qquad (A6)$$

for $k > 1$. The operators commute for different $k$ due
to the tensor product structure of Hilbert spaces $B(k)^L \otimes
B(k)^R$, and they are local as required. We omit a proof
that this procedure is correct, since it is a special case of
our more general result on trees, below.
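
In the purely classical special case, where every block $j$ is one-dimensional and the quantum factors in Eq. (A6) are trivial, the construction reduces to writing a Markov chain distribution as a Gibbs weight of nearest-neighbor terms. The sketch below illustrates only this special case: the four-site chain, the local dimension, and the random transition tables are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 3                                        # classical states per site
q1_true = rng.dirichlet(np.ones(d))          # distribution of site 1
T = rng.dirichlet(np.ones(d), size=(3, d))   # T[t][i, j] = Prob(j | i)

# Joint distribution of a 4-site classical Markov chain (diagonal of rho).
p = np.einsum('a,ab,bc,cd->abcd', q1_true, T[0], T[1], T[2])

# Recover q_1(j) and q_k(j|i) from the joint state alone, as in Eq. (A3).
q1 = p.sum(axis=(1, 2, 3))
cond = []
for k in (1, 2, 3):
    other_axes = tuple(ax for ax in range(4) if ax not in (k - 1, k))
    pair = p.sum(axis=other_axes)            # pair[i, j] on sites (k-1, k)
    cond.append(pair / pair.sum(axis=1, keepdims=True))

# Local "Hamiltonian" terms: H_1 = -ln q_1, H_k = -ln q_k(j|i).
H_total = (-np.log(q1)[:, None, None, None]
           - np.log(cond[0])[:, :, None, None]
           - np.log(cond[1])[None, :, :, None]
           - np.log(cond[2])[None, None, :, :])
p_rec = np.exp(-H_total)
```

Here `p_rec` reproduces `p` and sums to one, so the normalization constant is $Z = 1$ for this choice of terms. The quantum statement adds the nearest-neighbor operators acting inside each block.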

We now describe a similar procedure which can be applied
to any tree graph. First, some definitions. We
define a density matrix $\rho_{ABC}$ to be a Markov chain
$A - B - C$ if strong subadditivity is saturated, so that
$S(C|BA) = S(C|B)$. We define a density matrix on a
multi-partite system to be a Markov network if there is
a graph, with each subsystem corresponding to a node of
the graph, such that, given any three disjoint sets $A, B, C$
of nodes of the graph such that all paths from any node
in $A$ to any node in $C$ must pass through a node in $B$, the
density matrix $\rho_{ABC}$ is a Markov chain on $A - B - C$.
We will later consider tree graphs, with nodes labelled
$1, \ldots, N$. Let node 1 be called the "root" of the tree. For
each node other than the root, the "parent" of that node
is the neighbor of that node which is closer
to the root, and the daughters are the
other neighbors of that node. We let $p(i)$ be the parent
function: $p(i)$ is the parent node of node $i$ if $i > 1$. Let
the nodes be ordered such that if $i < j$ then the path
from node $i$ to the root does not pass through node $j$
(i.e., node $i$ is not a daughter, grand-daughter, etc. of
node $j$). We say that such a tree is a "Markov tree" if,
for each node $k > 1$ we have

$$S(k|\{<k\}) = S(k|\{<k\} \cap \mathcal{N}_k) = S(k|\mathcal{M}_k), \qquad (A7)$$

where the Markov shield $\mathcal{M}_k$ of $k$ is the parent of node
$k$. Note that a Markov tree is simply a Markov network
that is a tree graph; while we have defined Markov trees
with a particular choice of root, they would be Markov
trees for any choice of the root.
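
The ordering condition and the resulting shields are easy to compute for a concrete tree. The sketch below assumes the tree is given as an edge list on nodes labelled $1, \ldots, N$; the function names are hypothetical. It uses the fact that the ordering condition is equivalent to every node having a parent with a smaller label.

```python
from collections import deque

def parents_from_edges(n_nodes, edges, root=1):
    """Parent function p(i) of a tree, found by breadth-first search."""
    adj = {v: [] for v in range(1, n_nodes + 1)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    parent = {root: None}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                queue.append(w)
    return parent

def is_valid_ordering(parent):
    """True if no node precedes one of its ancestors in the labelling."""
    return all(p is None or p < k for k, p in parent.items())

def markov_shields(parent):
    """Shield of each node: its parent (empty for the root)."""
    return {k: ([] if p is None else [p]) for k, p in parent.items()}

good = parents_from_edges(5, [(1, 2), (1, 3), (2, 4), (2, 5)])
bad = parents_from_edges(3, [(1, 3), (3, 2)])   # node 2 hangs below node 3
shields = markov_shields(good)
```

In the second tree the path from node 2 to the root passes through node 3, so the labelling violates the ordering and would have to be permuted before applying Eq. (A7).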

With these definitions, we will prove a result which extends
the quantum Hammersley-Clifford theorem derived
in [8]:

Theorem 1. Any Markov network on a tree can be expressed
as $\rho = \frac{1}{Z} \exp(-H)$ where $H$ is the sum of local,
commuting terms.

Proof. This is a corollary of Theorem 3 proven below, as
the operators $H_i$ in that theorem are local and commuting. □

The following Lemma will be useful in proving Theorem 2.

Lemma 1. Let $\rho_{ABC}$ be a Markov chain on $A - B - C$
and suppose that $P$ is a projector onto a subspace of $\mathcal{H}_B$
such that $[P, \rho_{ABC}] = 0$. Then $P \rho_{ABC} P$ is also a Markov
chain on $A - B - C$.

Proof. We can write $\rho_{ABC} = \rho_{AB_1C} \oplus \rho_{AB_2C}$. Saturation
of SSA is equivalent [8] to the equality $\log \rho_{ABC} =
\log \rho_{AB} + \log \rho_{BC} - \log \rho_B$. The proof follows from the
fact that $\log(X \oplus Y) = \log X \oplus \log Y$. □

We first prove a special case of our result on a tree,
which can be thought of as a generalization of the structure
theorem. Since we will use this terminology later,
we first make a definition. Given a multi-partite state $\rho$ on $N$
subsystems, labelled $1, \ldots, N$ and referred to as "nodes",
define a splitting of node $k$ to be a decomposition of
the Hilbert space $\mathcal{H}_k$ on node $k$ as

$$\mathcal{H}_k = \bigoplus_j \mathcal{H}_k(j), \qquad (A8)$$

where each Hilbert space $\mathcal{H}_k(j)$ can be decomposed into
a tensor product

$$\mathcal{H}_k(j) = \bigotimes_{i \ne k,\ 1 \le i \le N} \mathcal{H}_{k \to i}(j), \qquad (A9)$$

such that the density matrix $\rho$ can be expressed as

$$\rho = \bigoplus_j q(j) \bigotimes_{i \ne k,\ 1 \le i \le N} \rho_{\mathcal{H}_{k \to i}(j),\, i}, \qquad (A10)$$

where $\rho_{\mathcal{H}_{k \to i}(j),\, i}$ is a density matrix on $\mathcal{H}_{k \to i}(j)$ and $i$.
We now prove that

Theorem 2. Consider any Markov tree with $N$ nodes,
such that all nodes, other than the root, are daughters of
the root. Then, there exists a splitting of the root.

Proof. The proof is inductive. Let node 1 be the root
to simplify notation. Assume that we have proven the
theorem when the Markov tree has only $N-1$ nodes
(the case $N = 3$ is the structure theorem of [16]). Apply
the structure theorem with the three subsystems $A =
\{2, \ldots, N-1\}$, $B = \{1\}$, $C = \{N\}$ to show that there
exists a decomposition of the Hilbert space $\mathcal{H}_1$ on node
1 into $\mathcal{H}_1 = \bigoplus_j B(j)$, where $B(j) = B(j)^L \otimes B(j)^R$ with
$\rho = \bigoplus_j q(j)\, \rho_{A,B(j)^L} \otimes \rho_{B(j)^R,C}$. By Lemma 1, $\rho_{A,B(j)^L} \otimes
\rho_{B(j)^R,C}$ is a Markov tree. Thus, $\rho_{A,B(j)^L}$ is a Markov tree
on a graph of $N-1$ nodes. Thus, applying the inductive
assumption, there exists a decomposition of $B(j)^L$ into a
direct sum of Hilbert spaces

$$B(j)^L = \bigoplus_k \mathcal{H}_{1,j}(k), \qquad (A11)$$

where each Hilbert space $\mathcal{H}_{1,j}(k)$ can be decomposed into
a tensor product

$$\mathcal{H}_{1,j}(k) = \bigotimes_{i \ge 2}^{N-1} \mathcal{H}_{(1,j) \to i}(k), \qquad (A12)$$

such that the density matrix $\rho_{A,B(j)^L}$ can be expressed
as

$$\rho_{A,B(j)^L} = \bigoplus_k r_j(k) \bigotimes_{i \ge 2}^{N-1} \rho_{\mathcal{H}_{(1,j) \to i}(k),\, i} \qquad (A13)$$

for some probability distribution $r_j(k)$. Then, let

$$\mathcal{H}_1((j,k)) = \mathcal{H}_{1,j}(k) \otimes B(j)^R, \qquad (A14)$$

so that

$$\mathcal{H}_1((j,k)) = \bigotimes_{i \ge 2}^{N-1} \mathcal{H}_{(1,j) \to i}(k) \otimes B(j)^R. \qquad (A15)$$

Treating the two indices $j, k$ as a single index, this gives
a splitting for $N$ sites. □

We now prove a structure theorem for trees: 

Theorem 3. Consider a tree graph with $N$ different
nodes, labelled $1, \ldots, N$, forming a Markov tree. Then,
for each node $k$ there exists a decomposition of the Hilbert
space $\mathcal{H}_k$ on that node into a sum of Hilbert spaces

$$\mathcal{H}_k = \bigoplus_j \mathcal{H}_k(j), \qquad (A16)$$

where each Hilbert space $\mathcal{H}_k(j)$ can be decomposed into a
tensor product

$$\mathcal{H}_k(j) = \bigotimes_i \mathcal{H}_{k \to i}(j), \qquad (A17)$$

where the product ranges over nodes $i$ which are neighbors
of node $k$, such that the following properties hold. We use
$P_k(j)$ to denote the projector onto $\mathcal{H}_k(j)$, we use $q_k(j)$
to denote $\mathrm{Tr}(\rho P_k(j))$, and we define

$$q_k(j|i) = \mathrm{Tr}(P_k(j) P_{p(k)}(i) \rho)/\mathrm{Tr}(P_{p(k)}(i) \rho). \qquad (A18)$$

Then, the density matrix $\rho$ can be expressed as

$$\rho = \exp\Big(-\sum_{k=1}^{N} H_k\Big), \qquad (A19)$$

where

$$H_1 = -\sum_j P_1(j) \ln(q_1(j)), \qquad (A20)$$

and

$$H_k = -\sum_{i,j} P_k(j) P_{p(k)}(i) \Big( \ln(q_k(j|i)) + \ln(\rho_{\mathcal{H}_{p(k) \to k}(i),\, \mathcal{H}_{k \to p(k)}(j)}) \Big). \qquad (A21)$$

(Note that in case $q_k(j) = 0$ for any $k$, we define
$\exp[\ln(q_k(j))] = 0$ and define conditional probabilities in
which $q_k(j)$ appears in the denominator arbitrarily.)
FIG. 6: Coarse graining procedure.



Proof. For each node $k$, consider the subgraph consisting
of $k$ and all of its neighbors. Let the decomposition
$\mathcal{H}_k(j) = \bigotimes_i \mathcal{H}_{k \to i}(j)$ in the statement of this theorem be
the splitting given in the previous theorem for the given
subgraph.

For use later, we define a new coarse-grained graph, as
follows. Let $k$ be the root of the new graph, labelled $k$.
For each neighbor $i$ of node $k$ on the original graph, group
that neighbor and all nodes connected to that neighbor
by a path that does not go through node $k$ into one node
on the new graph, and label that new node $\tilde{i}$. See Fig. 6.
We claim that the splitting above also provides a splitting
on the coarse-grained graph. This holds because
the density matrix on the coarse-grained graph can be
constructed from the density matrix on the subgraph by
applying a super-operator which is a product of super-operators
on each of the nodes as follows. Let $\rho_{k,\{i \in n(k)\}}$
be the density matrix on $k$ and its neighbors $i$, tensored
with the identity on the remaining nodes. Let $n(k)$ be
the set of neighbors of $k$. We have

$$\rho_{k,\{\tilde{i}\}} = \Big(\prod_{i \in n(k)} \rho_{\tilde{i}}^{1/2} \rho_i^{-1/2}\Big)\, \rho_{k,\{i \in n(k)\}}\, \Big(\prod_{i \in n(k)} \rho_i^{-1/2} \rho_{\tilde{i}}^{1/2}\Big) \qquad (A22)$$

by strong subadditivity, where $\rho_{\tilde{i}}$ is the density matrix on the
coarse-grained node containing $i$, so the splitting on node $k$ is a
splitting on the coarse-grained graph.

Let $P_k(j)$ project onto $\mathcal{H}_k(j)$. Consider any sequence
of integers $j_1, \ldots, j_N$. Define

$$P(j_1, \ldots, j_N) = \mathrm{Tr}(P_1(j_1) \cdots P_N(j_N)\, \rho) \qquad (A23)$$

and

$$\rho(j_1, \ldots, j_N) = \frac{P_1(j_1) \cdots P_N(j_N)\, \rho\, P_1(j_1) \cdots P_N(j_N)}{P(j_1, \ldots, j_N)}. \qquad (A24)$$

The state $\rho(j_1, \ldots, j_N)$ is non-zero only on $\mathcal{H}_1(j_1) \otimes \ldots \otimes
\mathcal{H}_N(j_N)$, where we claim that it is equal to a product
state [it is a product of states on $\mathcal{H}_{k \to p(k)}(j_k) \otimes
\mathcal{H}_{p(k) \to k}(j_{p(k)})$ over all $k$]. We will prove this claim inductively,
by proving that given any tree of $N$ nodes, such
that for each node of the tree we have a splitting of that
node with corresponding projectors $P_i(j_i)$, then the state
$\rho(j_1, \ldots, j_N)$ has the given product form. Assume it is true
on any tree of at most $N-1$ nodes. Assume, without loss
of generality, that node 1 has at least two neighbors (if
no node has two or more neighbors, we are at the case
$N = 2$ which is trivial). The decomposition $P_1(j_1)$ gives
a splitting of node 1 on the coarse-grained graph. So, the
state $P_1(j_1)\, \rho\, P_1(j_1)/\mathrm{Tr}(P_1(j_1) \rho)$ is equal to a product of
states $\bigotimes_i \rho_{\mathcal{H}_{1 \to i}(j_1),\, i}$. For any $i$, we consider a tree given
by the nodes in the original tree which are in $i$ in the
coarse-grained tree and by the space $\mathcal{H}_{1 \to i}$, considered
as a single node. This tree has at most $N-1$ nodes. The
splitting that we had on nodes $2, \ldots, N$ on the original
tree provides a splitting on the nodes on the new tree in
the natural manner. To see this, note that for any node
$k$, if the state $P_k(j_k)\, \rho\, P_k(j_k)$ is a product state on the
coarse-grained graph with $k$ as the root, then the state
$P_1(j_1) P_k(j_k)\, \rho\, P_k(j_k) P_1(j_1)$ is also a product state. Thus,
since we have a splitting on the new tree, the state on
the new tree has the product structure, so $\rho(j_1, \ldots, j_N)$
does indeed have the product structure that we claim.
One may directly verify that

$$P_1(j_1) \cdots P_N(j_N) \exp\Big(-\sum_k H_k\Big) P_1(j_1) \cdots P_N(j_N) \propto \rho(j_1, \ldots, j_N). \qquad (A25)$$

So, it suffices to show that the given Hamiltonian $\sum_k H_k$
produces the correct normalization so that

$$P(j_1, \ldots, j_N) = \mathrm{Tr}\Big(P_1(j_1) \cdots P_N(j_N) \exp\Big(-\sum_k H_k\Big)\Big). \qquad (A26)$$

However, the probability distribution $P(j_1, \ldots, j_N)$ is a
classical Markov tree and so by the Hammersley-Clifford
theorem the desired result follows (this result can also
be proven inductively in roughly the same way as the
previous paragraph). □



